ECON 707/807: Econometrics II

Course Introduction

Evie Zhang

Old Dominion University

Time Series

Sequential measurements or values of a single entity over time.

Why study time series data?

  • Gross Domestic Product
  • Unemployment
  • Vehicle Demand
  • Energy Consumption

Forecasting

“Predict” outcomes of a time series in future (unobserved) periods.

Time Series Models

\[Y_{t} = f(Y_{t-1}, X_{t}, X_{t-1}, \tau_t, C_t, S_t)\] where:

  • \(Y_{t-1}\) is a lagged value of \(Y\)
  • \(X_t\) is the contemporaneous value of an independent variable \(X\)
  • \(X_{t-1}\) is a lagged value of \(X\)
  • \(\tau_t\) is the trend
  • \(C_t\) is the cycle
  • \(S_t\) is the season
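
A minimal sketch of how some of these components might enter a fitted model, assuming a simulated monthly series with a linear trend and a monthly seasonal pattern. All names (n, trend, month, y_lag1, fit) and parameter values are hypothetical; the cycle and the X terms are left out to keep it short.

```r
set.seed(707)                                   # reproducible simulated data
n      <- 120                                   # ten years of monthly observations
trend  <- 1:n                                   # tau_t: linear time trend
month  <- factor(rep(1:12, length.out = n))     # S_t: seasonal indicator
season <- 2 * sin(2 * pi * trend / 12)          # assumed seasonal pattern
y      <- 10 + 0.05 * trend + season + rnorm(n, sd = 0.5)

# Shift y down one period so Y_{t-1} can enter as a regressor
y_lag1 <- c(NA, y[-n])

# Y_t = f(Y_{t-1}, tau_t, S_t); lm() drops the first row because its lag is NA
fit <- lm(y ~ y_lag1 + trend + month)
summary(fit)$r.squared
```

The month factor expands into 11 dummy variables, so the fit has 14 coefficients: intercept, lag, trend, and 11 seasonal dummies.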

Statistics Review

  • mean
  • variance
  • standard error
  • central limit theorem (CLT)
  • p-value
  • \(Y_i = \alpha + \beta X_i + \epsilon_{i}\)
  • causality
  • spurious
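
A short sketch of several review items on simulated data, assuming \(\alpha = 1\) and \(\beta = 2\) (values chosen only for illustration). The last lines regress one independent random walk on another, a setting where regressions frequently report a small p-value even though the series are unrelated.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)          # alpha = 1, beta = 2 (assumed values)

mean_y <- mean(y)                  # sample mean
var_y  <- var(y)                   # sample variance (n - 1 denominator)
se_y   <- sd(y) / sqrt(n)          # standard error of the sample mean

fit <- lm(y ~ x)                   # Y_i = alpha + beta * X_i + epsilon_i
coef(summary(fit))                 # estimates, standard errors, t-stats, p-values

# Two independent random walks: any "significant" slope here is spurious
rw1 <- cumsum(rnorm(n))
rw2 <- cumsum(rnorm(n))
coef(summary(lm(rw1 ~ rw2)))
```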

Forecasting Steps

  1. Define Problem

    What am I trying to solve?

  2. Gather Data

    FRED, WRDS, etc.

  3. Exploratory Data Analysis (EDA)

    Plot, Plot, Plot

  4. Choose & Fit a Model

    \(Y_{t} = f(Y_{t-1}, X_{t}, X_{t-1}, \tau_t, C_t, S_t)\)

  5. Evaluate, Forecast
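
The five steps above can be sketched end to end. This is one possible workflow, not a prescribed one: it assumes the built-in AirPassengers series as a stand-in for data pulled from FRED or WRDS, a 12-month holdout, and a log-linear trend-plus-season model. The names ap, train, test, and rmse are hypothetical.

```r
# 1. Define problem: forecast the last 12 months of a monthly series
# 2. Gather data: AirPassengers ships with base R (144 monthly observations)
ap <- data.frame(y     = as.numeric(AirPassengers),
                 trend = seq_along(AirPassengers),
                 month = factor(as.integer(cycle(AirPassengers))))

# 3. EDA: plot the series
plot(AirPassengers, main = "Monthly Air Passengers")

# 4. Choose & fit a model on all but the last 12 months
train <- ap[1:132, ]
test  <- ap[133:144, ]
fit   <- lm(log(y) ~ trend + month, data = train)

# 5. Evaluate: forecast the holdout and compute root mean squared error
pred <- exp(predict(fit, newdata = test))
rmse <- sqrt(mean((test$y - pred)^2))
rmse
```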

Packages and Functions

  • table()
  • aggregate()
  • match()
  • paste(), paste0(), substr(), grepl(), gsub(), regexpr(), strsplit(), unlist()
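
A quick tour of the functions listed above on a small made-up data.frame (the state and value columns are hypothetical):

```r
d <- data.frame(state = c("NY", "VA", "CA", "NY", "VA", "NY"),
                value = c(1, 2, 3, 4, 5, 6))

table(d$state)                                         # row counts per state
agg <- aggregate(value ~ state, data = d, FUN = mean)  # mean of value by state
idx <- match(c("VA", "CA"), d$state)                   # first matching rows: 2, 3

id <- paste0(d$state, "_", d$value)                    # "NY_1", "VA_2", ...
substr(id, 1, 2)                                       # first two characters
grepl("^NY", id)                                       # TRUE where id starts with "NY"
gsub("_", "-", id)                                     # replace underscore with hyphen
pos   <- regexpr("_", id)                              # position of the first underscore
parts <- unlist(strsplit(id[1], "_"))                  # c("NY", "1")
```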

Packages and Functions

  • for()
  • if(), else
  • ifelse()
  • fixest
  • scales::alpha()
  • data.table::fread()
  • readRDS(), saveRDS()
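
A base-R sketch contrasting a for()/if() loop with the vectorized ifelse(), followed by a saveRDS()/readRDS() round trip. The vector x and the temporary file are hypothetical; fixest, scales, and data.table are add-on packages and are not used here.

```r
x <- c(-2, 0, 3)

# ifelse() is vectorized: one result per element
sign_vec <- ifelse(x > 0, "positive", "non-positive")

# for() with if/else handles one element at a time
sign_loop <- character(length(x))
for (i in seq_along(x)) {
  if (x[i] > 0) {
    sign_loop[i] <- "positive"
  } else {
    sign_loop[i] <- "non-positive"
  }
}
identical(sign_vec, sign_loop)

# saveRDS()/readRDS() write and restore a single R object
tmp <- tempfile(fileext = ".rds")
saveRDS(x, tmp)
x_back <- readRDS(tmp)
```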

Packages and Functions

  • lubridate
  • tsibble
  • forecast
  • ts()
  • duplicated()
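
A small base-R sketch of ts() and duplicated(). The six monthly values and the state vector are illustrative only; lubridate, tsibble, and forecast are add-on packages and are not needed for this part.

```r
# Monthly series starting January 1948
u <- ts(c(3.4, 3.8, 4.0, 3.9, 3.5, 3.6), start = c(1948, 1), frequency = 12)
frequency(u)      # 12 observations per year
time(u)           # fractional-year time index

# duplicated() flags elements that repeat an earlier element
state <- c("NY", "NY", "VA", "VA", "CA")
duplicated(state)                          # FALSE TRUE FALSE TRUE FALSE
first_rows <- which(!duplicated(state))    # first row of each state: 1, 3, 5
```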

Install and Load Packages

Code
install.packages("lubridate")  # one-time install; skip if already installed
library("lubridate")           # load the package in each new session

Make a Time Series

Code
urate <- read.csv("../data/unrate_us.csv")
colnames(urate) <- c("date", "urate_t")
head(urate)
        date urate_t
1 1948-01-01     3.4
2 1948-02-01     3.8
3 1948-03-01     4.0
4 1948-04-01     3.9
5 1948-05-01     3.5
6 1948-06-01     3.6

Plot a Time Series

Code
urate$date <- ymd(urate$date)
plot(urate$date, urate$urate_t,
     xlab = "Month",
     ylab = "Unemployment Rate",
     main = "Monthly Unemployment Rate")

Plot a Time Series

Code
plot(urate$date, urate$urate_t,
     ylab = "Unemployment Rate",
     xlab = "Time",
     main = "Monthly Unemployment Rate",
     type = "l", col = "dodgerblue")

Lags (Leads)

        date urate_t
1 1948-01-01     3.4
2 1948-02-01     3.8
3 1948-03-01     4.0
4 1948-04-01     3.9
5 1948-05-01     3.5
6 1948-06-01     3.6
Code
t <- 2:nrow(urate)                        # rows that have a previous row
urate$urate_tm1 <- NA
urate$urate_tm1[t] <- urate$urate_t[t-1]  # lag: last month's rate
urate$urate_tp1 <- NA
urate$urate_tp1[t-1] <- urate$urate_t[t]  # lead: next month's rate
head(urate)
        date urate_t urate_tm1 urate_tp1
1 1948-01-01     3.4        NA       3.8
2 1948-02-01     3.8       3.4       4.0
3 1948-03-01     4.0       3.8       3.9
4 1948-04-01     3.9       4.0       3.5
5 1948-05-01     3.5       3.9       3.6
6 1948-06-01     3.6       3.5       3.6
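
The same indexing idea can be wrapped in a reusable function. This is a sketch of a hypothetical helper, shift(), which is not part of base R: a positive k gives a lag and a negative k gives a lead.

```r
# Hypothetical helper: shift x by k periods (k > 0 lags, k < 0 leads)
shift <- function(x, k = 1) {
  n <- length(x)
  if (k == 0) return(x)
  if (abs(k) >= n) return(rep(NA, n))
  if (k > 0) {
    c(rep(NA, k), x[1:(n - k)])        # lag: earlier values move down
  } else {
    c(x[(1 - k):n], rep(NA, -k))       # lead: later values move up
  }
}

u <- c(3.4, 3.8, 4.0, 3.9, 3.5, 3.6)   # first six unemployment rates shown above
shift(u, 1)    # NA 3.4 3.8 4.0 3.9 3.5
shift(u, -1)   # 3.8 4.0 3.9 3.5 3.6 NA
```

The last lead is NA here only because the vector stops after six values; in the full data set the seventh month fills it in.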

What about panels?

Code
df <- data.frame(state = c(rep("NY", 10),
                           rep("VA", 10),
                           rep("CA", 10)),
                 year = rep(2010:2019, 3),
                 var = rnorm(30))
df[df$year %in% 2010:2012,]
   state year         var
1     NY 2010  0.98176529
2     NY 2011 -0.04575189
3     NY 2012  1.73884110
11    VA 2010 -1.23379611
12    VA 2011 -0.97220151
13    VA 2012 -1.20317155
21    CA 2010  0.06060990
22    CA 2011  0.68533755
23    CA 2012  0.48815305

What about panels?

Code
df$var_lag1 <- NA
df[df$year %in% 2010:2012,]
   state year         var var_lag1
1     NY 2010  0.98176529       NA
2     NY 2011 -0.04575189       NA
3     NY 2012  1.73884110       NA
11    VA 2010 -1.23379611       NA
12    VA 2011 -0.97220151       NA
13    VA 2012 -1.20317155       NA
21    CA 2010  0.06060990       NA
22    CA 2011  0.68533755       NA
23    CA 2012  0.48815305       NA

What about panels?

Code
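df$var_lag1 <- NA
# Rows after the first within each state (assumes rows sorted by state, then year)
t <- which(duplicated(df$state))
# The previous row is the same state's prior year, so each state's first year stays NA
df$var_lag1[t] <- df$var[t-1]
df[df$year %in% 2010:2012,]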
   state year         var    var_lag1
1     NY 2010  0.98176529          NA
2     NY 2011 -0.04575189  0.98176529
3     NY 2012  1.73884110 -0.04575189
11    VA 2010 -1.23379611          NA
12    VA 2011 -0.97220151 -1.23379611
13    VA 2012 -1.20317155 -0.97220151
21    CA 2010  0.06060990          NA
22    CA 2011  0.68533755  0.06060990
23    CA 2012  0.48815305  0.68533755
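
An alternative sketch that does not depend on the rows already being in order: sort first, then lag within each state using ave(). The data.frame is rebuilt under the hypothetical name panel with an arbitrary seed, so the values will differ from the output above. It still assumes there are no missing years within a state.

```r
set.seed(807)
panel <- data.frame(state = rep(c("NY", "VA", "CA"), each = 10),
                    year  = rep(2010:2019, 3),
                    var   = rnorm(30))

# Sort by state, then year, so "previous row" means "previous year"
panel <- panel[order(panel$state, panel$year), ]

# ave() applies the lag within each state; each state's first year is NA
panel$var_lag1 <- ave(panel$var, panel$state,
                      FUN = function(v) c(NA, v[-length(v)]))

panel[panel$year %in% 2010:2012, ]
```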

Multiple Plots

Code
plot(df$year, df$var)

Multiple Plots

Code
plot(df$year, df$var,
     col = match(df$state, unique(df$state)))

Multiple Plots

Code
plot(df$year, df$var,
     col = match(df$state, unique(df$state)),
     type = "l")

Multiple Plots

Code
lim <- df$state == "NY"
plot(df$year[lim], df$var[lim],
     col = match(df$state[lim], unique(df$state)),
     type = "l")

Multiple Plots

Code
lim <- df$state == "NY"
plot(df$year[lim], df$var[lim],
     col = match(df$state[lim], unique(df$state)),
     type = "l")
lim <- df$state == "VA"
lines(df$year[lim], df$var[lim],
     col = match(df$state[lim], unique(df$state)))
lim <- df$state == "CA"
lines(df$year[lim], df$var[lim],
     col = match(df$state[lim], unique(df$state)))

Multiple Plots

Code
plot(df$year, df$var, type = "n")
lim <- df$state == "NY"
lines(df$year[lim], df$var[lim],
     col = match(df$state[lim], unique(df$state)))
lim <- df$state == "VA"
lines(df$year[lim], df$var[lim],
     col = match(df$state[lim], unique(df$state)))
lim <- df$state == "CA"
lines(df$year[lim], df$var[lim],
     col = match(df$state[lim], unique(df$state)))
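
The three repeated lines() calls can be collapsed into a for() loop. This sketch rebuilds the data under the hypothetical name panel with an arbitrary seed, and adds a legend, which is not in the slides above.

```r
set.seed(807)
panel <- data.frame(state = rep(c("NY", "VA", "CA"), each = 10),
                    year  = rep(2010:2019, 3),
                    var   = rnorm(30))
states <- unique(panel$state)

# Empty canvas sized to the full data, then one line per state
plot(panel$year, panel$var, type = "n",
     xlab = "Year", ylab = "var")
for (i in seq_along(states)) {
  lim <- panel$state == states[i]
  lines(panel$year[lim], panel$var[lim], col = i)
}
legend("topright", legend = states, col = seq_along(states), lty = 1)
```

Starting from type = "n" sets the axis limits from all states at once, so no line is clipped by the range of the first state plotted.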

Next Class

  1. \(E[y]\), \(\hat{y}\)
  2. Loss Functions
  3. Lags, Leads
  4. Conditional Forecasts
  5. Time Series Components

Practice - Cleaning Data

  1. Work on #5.

  2. Work on #8.

Practice - Social Security

  1. Navigate to the following link: here
  2. Download the state-specific data.
  3. Read in the data from Virginia.
    • Make some plots for a name of your choosing (time series, distributions, etc.)
  4. Read in all of the data.
    • Combine into one data.frame.
    • In what state is your chosen name “most” popular?
    • In what year is your chosen name “most” popular?
    • Analyze the “gender neutrality” of a relatively unisex name (e.g. “Alex”).
    • Analyze a common name and its nickname (e.g. “Alex” vs “Alexander”).